@justinsb (Member) commented Sep 30, 2025

Less hacky support for GCP, encode more of the logic into controllers.

@k8s-ci-robot added the do-not-merge/work-in-progress, cncf-cla: yes, and size/XXL labels on Sep 30, 2025
@k8s-ci-robot requested a review from zetaab on September 30, 2025 22:42
@justinsb force-pushed the clusterapi_controllers branch 2 times, most recently from c2c7c26 to 5e7c5bb, on September 30, 2025 22:55
@hakman requested review from hakman and removed request for olemarkus and zetaab on September 30, 2025 23:27
@justinsb force-pushed the clusterapi_controllers branch 4 times, most recently from 80b82ec to 855ef49, on October 6, 2025 16:43
@k8s-ci-robot added the area/provider/gcp label on Oct 6, 2025
@justinsb force-pushed the clusterapi_controllers branch 2 times, most recently from 27f468b to 3e80afa, on October 7, 2025 16:22
@justinsb (Member Author) commented Oct 7, 2025

/test pull-kops-scenario-clusterapi-gcp

Let's try this new test :-)

@hakman (Member) commented Oct 7, 2025

/test pull-kops-scenario-clusterapi-gcp

@hakman force-pushed the clusterapi_controllers branch from b8bc822 to a4fbb3d on October 7, 2025 18:06
@hakman (Member) commented Oct 7, 2025

/test pull-kops-scenario-clusterapi-gcp

@hakman force-pushed the clusterapi_controllers branch from a4fbb3d to 5496816 on October 7, 2025 18:08
@hakman (Member) commented Oct 7, 2025

/test pull-kops-scenario-clusterapi-gcp

@justinsb force-pushed the clusterapi_controllers branch 2 times, most recently from a3ef085 to fb68367, on October 7, 2025 22:26
@justinsb (Member Author) commented Oct 7, 2025

/test pull-kops-scenario-clusterapi-gcp

@justinsb force-pushed the clusterapi_controllers branch from fb68367 to b7f6dc3 on October 7, 2025 22:58
@justinsb (Member Author) commented Oct 7, 2025

/test pull-kops-scenario-clusterapi-gcp

@justinsb (Member Author) commented

/test pull-kops-scenario-clusterapi-gcp

Doh! Looks like the bare-metal tasks were running out of disk space - finally spotted the warning message! Found a script from the Apache Flink project (Apache licensed) and adapted it - e.g. we want to keep gcloud / azure-cli etc!

@ameukam (Member) commented Oct 24, 2025

> /test pull-kops-scenario-clusterapi-gcp
>
> Doh! Looks like the bare-metal tasks were running out of disk space - finally spotted the warning message! Found a script from the Apache Flink project (Apache licensed) and adapted it - e.g. we want to keep gcloud / azure-cli etc!

@justinsb unrelated but you can now run those bare-metal tests in prow

@justinsb (Member Author) commented

> @justinsb unrelated but you can now run those bare-metal tests in prow

Well, not entirely unrelated given the pain :-). I will look into that!

@ameukam (Member) commented Oct 24, 2025

> > @justinsb unrelated but you can now run those bare-metal tests in prow
>
> Well, not entirely unrelated given the pain :-). I will look into that!

Just make sure you pick the right nodepool for the nodeSelector:

gcloud container node-pools list --cluster=prow-build --project k8s-infra-prow-build --region us-central1 --format='value(name,config.advancedMachineFeatures)'
pool6-20251021213510079600000001        enableNestedVirtualization=True
pool7-arm64-20251021220031252300000001
pool7-20251021221638051700000001

Also, GKE Node could include a label cloud.google.com/gke-nested-virtualization=true 🤷🏾
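
As a rough illustration of the pool selection above, here is a minimal Python sketch that filters a listing in the same shape as the gcloud output in this comment and keeps only the pools reporting `enableNestedVirtualization=True`. The function name `nested_virt_pools` and the `SAMPLE` constant are hypothetical helpers introduced for this example, and the whitespace-separated parsing is an assumption based solely on the three output lines shown here.

```python
def nested_virt_pools(listing: str) -> list[str]:
    """Return names of node pools whose listing line reports nested virtualization.

    Assumes each line is "<pool-name> [<features>]" separated by whitespace,
    as in the gcloud output quoted in this thread (hypothetical helper).
    """
    pools = []
    for line in listing.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        name, features = fields[0], fields[1:]
        # Pools without advanced machine features have no second column.
        if any("enableNestedVirtualization=True" in f for f in features):
            pools.append(name)
    return pools


# Sample copied from the command output in the comment above.
SAMPLE = """\
pool6-20251021213510079600000001        enableNestedVirtualization=True
pool7-arm64-20251021220031252300000001
pool7-20251021221638051700000001
"""

print(nested_virt_pools(SAMPLE))
```

With the sample above, only the `pool6-...` entry is returned. That name would then be the value to match in the job's nodeSelector, for instance against the standard GKE `cloud.google.com/gke-nodepool` node label.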

justinsb and others added 12 commits October 24, 2025 11:56
The bare-metal tests were running out of disk space, which leads to confusing errors.
As we're struggling with disk space, we don't want to install all the recommended packages.
Less hacky support for GCP, encode more of the logic into controllers.

Co-authored-by: Ciprian Hacman <[email protected]>
Not the cleanest presentation, but this is the thing that is causing the most trouble right now.
We are unifying our asset-building paths, and the new path does not (yet) support filtering architectures inside kops-controller.
There's no logging from systemctl restart, so waiting for it to complete actually makes diagnostics harder.
@justinsb force-pushed the clusterapi_controllers branch from a4ef300 to 7868e58 on October 24, 2025 15:56
@justinsb (Member Author) commented

/test pull-kops-scenario-clusterapi-gcp

OK, looks like the bare-metal problem is the disk space. So now, did I break CAPI while trying to fix bare-metal? :-)

@hakman (Member) commented Oct 24, 2025

/lgtm
/approve
/hold in case you want to add something else

@k8s-ci-robot added the do-not-merge/hold and lgtm labels on Oct 24, 2025
@k8s-ci-robot (Contributor) commented

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hakman

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved label on Oct 24, 2025
@justinsb (Member Author) commented

Awesome. This has been a journey so I think we should unhold and iterate! Thanks for reviewing!

/hold cancel

@k8s-ci-robot removed the do-not-merge/hold label on Oct 24, 2025
@hakman (Member) commented Oct 24, 2025

Amazing progress, and quite fun to dig into CAPI.

@k8s-ci-robot merged commit 8f61cc0 into kubernetes:master on Oct 24, 2025
28 checks passed
@k8s-ci-robot added this to the v1.35 milestone on Oct 24, 2025

Labels: approved, area/addons, area/api, area/documentation, area/kops-controller, area/nodeup, area/provider/azure, area/provider/gcp, cncf-cla: yes, lgtm, size/XXL